Abstract. In this note, we give a construction that provides a tight
lower bound of $mn - 1$ for the length of the shortest word in the
intersection of two regular languages with state complexities $m$ and $n$.

1 Introduction 

Maslov observed that the state complexity of the intersection of two regular
languages that have state complexities $m$ and $n$ has an upper bound of $mn$ [2].
One can easily verify this result using the usual cross-product construction [1, p.
59]. This means that the shortest word in such an intersection cannot be longer
than $mn - 1$. It is natural to wonder if this bound is the best possible, over a
fixed alphabet size, for every choice of $m$ and $n$. Here we show that there is a
matching lower bound.

First we define some notation. A deterministic finite automaton (DFA) is a
quintuple $(Q, \Sigma, \delta, q_0, A)$ where $Q$ is the finite set of states, $\Sigma$ is the
finite input alphabet, $\delta : Q \times \Sigma \to Q$ is the transition function, $q_0 \in Q$ is
the initial state, and $A \subseteq Q$ is the set of accepting states. For a DFA $M$, $L(M)$
denotes the language accepted by $M$. For any $x \in \Sigma^*$, $|x|$ denotes the length of $x$,
and $|x|_a$ for some $a \in \Sigma$ denotes the number of occurrences of $a$ in $x$. We also
define two maps from nonempty languages to $\mathbb{N}$ as follows. For a nonempty
language $L$, let $\mathrm{lss}(L)$ denote the length of the shortest word in $L$. If $L$ is
regular, then we let $\mathrm{sc}(L)$ denote the state complexity of $L$ (the minimal number
of states in any DFA accepting $L$).
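As an illustration of these definitions, the following Python sketch represents a DFA by a transition dictionary and computes the length of its shortest accepted word by breadth-first search from the initial state. The function name `lss` and the dictionary encoding are assumptions made for this example, not notation from the note.

```python
from collections import deque

def lss(delta, start, accepting):
    """Length of the shortest accepted word, or None if the language is empty.

    delta maps (state, symbol) -> state; this encoding is an assumption
    of the sketch.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        if q in accepting:
            # BFS visits states in order of distance, so this is minimal.
            return dist[q]
        for (p, _symbol), r in delta.items():
            if p == q and r not in dist:
                dist[r] = dist[q] + 1
                queue.append(r)
    return None

# Example: a 3-state DFA over {a} accepting words of length 2 (mod 3).
delta = {(k, "a"): (k + 1) % 3 for k in range(3)}
print(lss(delta, 0, {2}))  # the shortest accepted word is "aa"
```

The search never needs to visit a state twice, which is why the result is always smaller than the number of states.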

We previously stated that the upper bound on the state complexity of the
intersection of two regular languages implies an upper bound on the length of the
shortest word in the intersection. More precisely, we have $\mathrm{lss}(L) < \mathrm{sc}(L)$, which
follows directly from the pumping lemma for regular languages [1, p. 55]. So all
that is left is to show that the upper bound of $mn - 1$ can actually be attained
for all $m$ and $n$. There is an obvious construction over a unary alphabet that
works when $\gcd(m, n) = 1$: namely, set

- $L_1 = \{x : |x| \equiv m - 1 \pmod{m}\}$, and

- $L_2 = \{x : |x| \equiv n - 1 \pmod{n}\}$.

However, this construction fails when $\gcd(m, n) \neq 1$, so we provide a more
general construction over a binary alphabet that works for all $m$ and $n$.
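A quick numerical check of the unary construction can be sketched as follows. The helper name `unary_lss` is hypothetical. It searches for the smallest length that is congruent to $m - 1$ modulo $m$ and to $n - 1$ modulo $n$, which is $\mathrm{lcm}(m, n) - 1$ and therefore equals $mn - 1$ only when $m$ and $n$ are coprime.

```python
from math import gcd

def unary_lss(m, n):
    """Smallest k with k = m - 1 (mod m) and k = n - 1 (mod n)."""
    k = 0
    while not (k % m == m - 1 and k % n == n - 1):
        k += 1
    return k

# Columns: m, n, gcd(m, n), shortest common length, target mn - 1.
for m, n in [(2, 3), (3, 4), (2, 4), (4, 6)]:
    print(m, n, gcd(m, n), unary_lss(m, n), m * n - 1)
```

For the coprime pairs the last two columns agree; for the pairs (2, 4) and (4, 6) the shortest common length falls short of $mn - 1$.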



2 Our result 



Proposition 1. For all integers $m, n \ge 1$ there exist DFAs $M_1$, $M_2$ with $m$ and
$n$ states, respectively, such that $L(M_1) \cap L(M_2) \neq \emptyset$, and
$\mathrm{lss}(L(M_1) \cap L(M_2)) = mn - 1$.

Proof. The proof is constructive. Without loss of generality, assume $m \le n$,
and set $\Sigma = \{0, 1\}$. Let $M_1$ be the DFA given by $(Q_1, \Sigma, \delta_1, p_0, A_1)$, where
$Q_1 = \{p_0, p_1, p_2, \ldots, p_{m-1}\}$, $A_1 = \{p_0\}$, and for each $a$, $0 \le a \le m - 1$, and
$c \in \{0, 1\}$ we set

\[ \delta_1(p_a, c) = p_{(a+c) \bmod m}. \tag{1} \]

Then

\[ L(M_1) = \{x \in \Sigma^* : |x|_1 \equiv 0 \pmod{m}\}. \]

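Let $M_2$ be the DFA $(Q_2, \Sigma, \delta_2, q_0, A_2)$, illustrated in Figure 1, where $Q_2 =
\{q_0, q_1, q_2, \ldots, q_{n-1}\}$, $A_2 = \{q_{n-1}\}$, and for each $a$, $0 \le a \le n - 1$,

\[
\delta_2(q_a, c) =
\begin{cases}
q_{a+c}, & \text{if } 0 \le a < m - 1; \\
q_{(a+1) \bmod n}, & \text{if } c = 0 \text{ and } m - 1 \le a \le n - 1; \\
q_0, & \text{if } c = 1 \text{ and } m - 1 \le a \le n - 1.
\end{cases}
\]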








Fig. 1. The DFA $M_2$.

Focussing solely on the 1's that appear in some accepting path in $M_2$, we see
that we can return to $q_0$

(a) via a simple path with $m$ 1's, or

(b) (if we go through $q_{n-1}$), via a simple path with $m - 1$ 1's and ending in the
transition $\delta_2(q_{n-1}, 0) = q_0$.
After some number of cycles through $q_0$, we eventually arrive at $q_{n-1}$. Letting $i$
denote the number of times a path of type (b) is chosen (including the last path
that arrives at $q_{n-1}$) and $j$ denote the number of times a path of type (a) is
chosen, we see that the number of 1's in any accepted word must be of the form
$i(m - 1) + jm$, with $i > 0$, $j \ge 0$. The number of 0's along such a path is then
at least $i(n - m + 1) - 1$, with the $-1$ in this expression arising from the fact
that the last part of the path terminates at $q_{n-1}$ without taking an additional
transition back to $q_0$.
Thus

\[
L(M_2) \subseteq \{x \in \Sigma^* : \exists\, i, j \in \mathbb{N} \text{ such that } i > 0,\ j \ge 0, \text{ and }
|x|_1 = i(m - 1) + jm,\ |x|_0 \ge i(n - m + 1) - 1\}.
\]

Furthermore, for every $i, j \in \mathbb{N}$ such that $i > 0$, $j \ge 0$, there exists an $x \in
L(M_2)$ such that $|x|_1 = i(m - 1) + jm$ and $|x|_0 = i(n - m + 1) - 1$. This is
obtained, for example, by cycling $j$ times from $q_0$ to $q_{m-1}$ and then back to $q_0$
via a transition on 1, then $i - 1$ times from $q_0$ to $q_{n-1}$ and then back to $q_0$ via
a transition on 0, and finally one more time from $q_0$ to $q_{n-1}$.
It follows then that

\[
L(M_1) \cap L(M_2) \subseteq \{x \in \Sigma^* : \exists\, i, j \in \mathbb{N} \text{ such that } i > 0,\ j \ge 0, \text{ and }
|x|_1 = i(m - 1) + jm,\ |x|_0 \ge i(n - m + 1) - 1,
\text{ and } i(m - 1) + jm \equiv 0 \pmod{m}\}.
\]

Further, for every such $i$ and $j$, there exists a corresponding element in
$L(M_1) \cap L(M_2)$. Since $m - 1$ and $m$ are relatively prime, the congruence forces
$m$ to divide $i$, so $i \ge m$; the shortest such word therefore corresponds
to $i = m$, $j = 0$, and satisfies $|x|_0 = m(n - m + 1) - 1$. In particular, a shortest
accepted word is $(1^{m-1} 0^{n-m+1})^{m-1} 1^{m-1} 0^{n-m}$, which is of length $mn - 1$. □
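The construction can be checked mechanically for small $m$ and $n$. The sketch below is not part of the proof. It encodes the states $p_a$ and $q_a$ by their indices, runs a breadth-first search over the cross-product of $M_1$ and $M_2$, and confirms that the shortest accepted word has length $mn - 1$ and that the word given above is accepted. All function names are hypothetical, and the tested range of $m$ and $n$ is an arbitrary choice.

```python
from collections import deque

def delta1(a, c, m):
    """Transition of M_1: count the 1's modulo m."""
    return (a + c) % m

def delta2(a, c, m, n):
    """Transition of M_2, following the case definition in the proof."""
    if a < m - 1:
        return a + c           # 0 loops, 1 advances
    if c == 0:
        return (a + 1) % n     # 0 advances, wrapping from q_{n-1} to q_0
    return 0                   # 1 resets to q_0

def lss_intersection(m, n):
    """BFS over the product automaton; length of the shortest common word."""
    start, goal = (0, 0), (0, n - 1)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return dist[state]
        a, b = state
        for c in (0, 1):
            nxt = (delta1(a, c, m), delta2(b, c, m, n))
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return None

def accepts_both(word, m, n):
    """Run both DFAs on a word over {0, 1} given as a string."""
    a = b = 0
    for ch in word:
        c = int(ch)
        a, b = delta1(a, c, m), delta2(b, c, m, n)
    return a == 0 and b == n - 1

def witness(m, n):
    """The word (1^{m-1} 0^{n-m+1})^{m-1} 1^{m-1} 0^{n-m}."""
    block = "1" * (m - 1) + "0" * (n - m + 1)
    return block * (m - 1) + "1" * (m - 1) + "0" * (n - m)

# Check every pair with 1 <= m <= n in a small range.
for m in range(1, 6):
    for n in range(m, 8):
        assert lss_intersection(m, n) == m * n - 1
        w = witness(m, n)
        assert len(w) == m * n - 1 and accepts_both(w, m, n)
```

For example, `witness(2, 3)` is the word `10010`, of length $2 \cdot 3 - 1 = 5$.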

It is natural to try to extend the construction to an arbitrary number of
DFAs. However, we have found empirically that, over a two-letter alphabet,
the corresponding bound $mnp - 1$ for three DFAs cannot always be attained. For
example, there are no DFAs of 2, 2, and 3 states for which the shortest word in
the intersection is of length $2 \cdot 2 \cdot 3 - 1$.